Binarization of color document images via luminance and saturation color features

Authors

  • Chun-Ming Tsai
  • Hsi-Jian Lee
Abstract

This paper presents a novel binarization algorithm for color document images. Conventional thresholding methods do not produce satisfactory binarization results for documents with close or mixed foreground colors and background colors. Initially, statistical image features are extracted from the luminance distribution. Then, a decision-tree based binarization method is proposed, which selects various color features to binarize color document images. First, if the document image colors are concentrated within a limited range, saturation is employed. Second, if the image foreground colors are significant, luminance is adopted. Third, if the image background colors are concentrated within a limited range, luminance is also applied. Fourth, if the total number of pixels with low luminance (less than 60) is limited, saturation is applied; else both luminance and saturation are employed. Our experiments include 519 color images, most of which are uniform invoice and name-card document images. The proposed binarization method generates better results than other available methods in shape and connected-component measurements. Also, the binarization method obtains higher recognition accuracy in a commercial OCR system than other comparable methods.
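The order of the four tests in the decision tree described above can be sketched in Python. This is only an illustration of the rule ordering given in the abstract, not the authors' implementation. The abstract states a single numeric value (the low-luminance cutoff of 60); the `LuminanceStats` fields, the `low_count_limit` parameter, and the reading of "limited" as "at most `low_count_limit`" are all assumptions made for this example.

```python
from dataclasses import dataclass

# The only numeric value stated in the abstract: "low luminance (less than 60)".
LOW_LUMINANCE_CUTOFF = 60


@dataclass
class LuminanceStats:
    """Hypothetical summary of the luminance distribution.

    How each flag is derived from the histogram is not given in the
    abstract, so the flags are taken as inputs here.
    """
    colors_concentrated: bool      # document colors lie within a limited range
    foreground_significant: bool   # foreground colors are significant
    background_concentrated: bool  # background colors lie within a limited range
    low_luminance_count: int       # pixels with luminance below the cutoff


def count_low_luminance(luminances, cutoff=LOW_LUMINANCE_CUTOFF):
    """Count pixels whose luminance is strictly below the cutoff."""
    return sum(1 for y in luminances if y < cutoff)


def select_features(stats, low_count_limit):
    """Return the color features to binarize with, following the rule order.

    `low_count_limit` is an assumed threshold for "limited"; the abstract
    does not state its value.
    """
    if stats.colors_concentrated:
        return ("saturation",)
    if stats.foreground_significant:
        return ("luminance",)
    if stats.background_concentrated:
        return ("luminance",)
    if stats.low_luminance_count <= low_count_limit:
        return ("saturation",)
    return ("luminance", "saturation")
```

For example, with made-up luminance values `[10, 50, 200, 60]`, two pixels fall below the cutoff, so the last rule selects saturation alone when `low_count_limit` is 2 or more and both features otherwise (provided the first three tests are false).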

Similar resources

A maximal-information color to gray conversion method for document images: Toward an optimal grayscale representation for document image binarization

A novel method to convert color/multi-spectral images to gray-level images is introduced to increase the performance of document binarization methods. The method uses the distribution of the pixel data of the input document image in a color space to find a transformation, called the dual transform, which balances the amount of information on all color channels. Furthermore, in order to reduce t...

LUMA Based Histogram for Image Enhancement

A luminance based multi scale retinex (LB_MSRCR) algorithm for the enhancement of darker images is proposed in this paper. The new technique consists only of the addition of the convolution results of 3 different scales. In this way, the color noise in the shadow/dark areas can be suppressed and the convolutions with different scales can be calculated simultaneously to save CPU time. Color saturat...

Ancient Document Images Enhancement Using Phase Based Binarization

In this paper, we present a phase-based binarization model for degraded document images, along with a post-processing method that can improve any binarization method and a ground-truth generation tool. Many binarization techniques have been proposed in the literature for different types of binarization problems. These include an adaptive image contrast based document image binarization technique t...

Text binarization in color documents

This article presents a new method for the binarization of color document images. Initially, the colors of the document image are reduced to a small number using a new color reduction technique. Specifically, this technique estimates the dominant colors and then assigns the original image colors to them so that the background and text components become uniform. Each dominant color defi...

Saturation vs. Intensity Distributions of Quality Color Images

No-reference image quality assessment is an image characterization task of sorts. We explore the chromatic characterization of color images and relate it to image quality. Poor image quality is usually related to a loss of contrast; contrast is measured along the color dimensions of luminance, saturation and hue (which is a circular variable). The distribution of saturation as a function of lum...

Journal title:
  • IEEE Transactions on Image Processing: a publication of the IEEE Signal Processing Society

Volume 11, Issue 4

Pages: -

Publication date: 2002